
    Optimal paths for avoiding a radiating source

    We consider the problem of navigating between points in the plane so as to minimize the exposure to a radiating source. Specifically, given two points z_1, z_2 in the complex plane, we solve the problem of finding the path C(t) (0 ≤ t ≤ 1) such that C(0) = z_1, C(1) = z_2 and ∫_0^1 |C'(t)| / |C(t)|^k dt is minimized. The parameter k specializes to a number of interesting cases: in particular, k = 2 pertains to the passive sensor avoidance problem and k = 4 entails the active radar avoidance problem. The avoidance paths which minimize exposure may have infinite arc-length. To overcome this problem we introduce a weighted exposure and path length optimization problem whose solution requires a variational approach. The optimal trajectory results we obtain are surprisingly intuitive in the cases of interest.
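As a rough numerical illustration of the exposure functional in this abstract, the sketch below approximates ∫ |C'(t)| / |C(t)|^k dt for a path given as a list of complex sample points, with the source placed at the origin. The helper names `exposure` and `arc`, and the midpoint quadrature, are assumptions made for this example and are not taken from the paper.

```python
import cmath
import math

def exposure(points, k):
    """Approximate the exposure of a polyline path to a source at the origin.

    Each segment contributes (segment length) / |midpoint|^k, which is a
    midpoint-rule estimate of the integral of |C'(t)| / |C(t)|^k dt.
    """
    total = 0.0
    for a, b in zip(points, points[1:]):
        mid = (a + b) / 2
        total += abs(b - a) / abs(mid) ** k
    return total

def arc(radius, theta0, theta1, n=2000):
    """Sample a circular arc around the origin as n + 1 complex points."""
    step = (theta1 - theta0) / n
    return [radius * cmath.exp(1j * (theta0 + step * i)) for i in range(n + 1)]

# Quarter circle of radius 2 with k = 2 (the passive sensor case).
value = exposure(arc(2.0, 0.0, math.pi / 2), k=2)
print(value)  # close to pi / 4
```

For a circular arc of radius r subtending an angle θ, the exact exposure is θ · r^(1−k), which gives a simple check on the approximation.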

    Tropical Geometry of Statistical Models

    This paper presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. The question addressed here is how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. A key role is played by the Newton polytope of a statistical model. Our results are applied to the hidden Markov model and to the general Markov model on a binary tree. Comment: 14 pages, 3 figures. Major revision. Applications now in companion paper, "Parametric Inference for Biological Sequence Analysis".
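One well-known way to see the link between the sum-product algorithm and tropical arithmetic is to run the same forward recursion for a hidden Markov model over two semirings. The sketch below is only an illustration under assumed toy parameters (`init`, `trans`, `emit`, `obs` are invented for this example); it is not the paper's framework and does not compute Newton polytopes.

```python
import math
import operator

def forward(obs, init, trans, emit, add, mul):
    """Forward recursion for an HMM over a semiring given by (add, mul)."""
    n = len(init)
    vals = [mul(init[s], emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        new = []
        for s in range(n):
            acc = mul(vals[0], trans[0][s])
            for r in range(1, n):
                acc = add(acc, mul(vals[r], trans[r][s]))
            new.append(mul(acc, emit[s][o]))
        vals = new
    total = vals[0]
    for v in vals[1:]:
        total = add(total, v)
    return total

# Hypothetical two-state model with binary observations.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1]

# Ordinary semiring (+, *): probability of the observation sequence.
prob = forward(obs, init, trans, emit, operator.add, operator.mul)

# Tropical semiring (min, +) on negative log parameters:
# negative log-probability of the single most likely hidden path.
w_init = [-math.log(p) for p in init]
w_trans = [[-math.log(p) for p in row] for row in trans]
w_emit = [[-math.log(p) for p in row] for row in emit]
best = forward(obs, w_init, w_trans, w_emit, min, operator.add)

print(prob, math.exp(-best))
```

Swapping (+, ×) for (min, +) turns the marginal-probability computation into a search for the most likely explanation, without changing the recursion itself.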

    Parametric Inference for Biological Sequence Analysis

    One of the major successes in computational biology has been the unification, using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied towards these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems associated with different statistical models. This paper introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models. Comment: 15 pages, 4 figures. See also companion paper "Tropical Geometry of Statistical Models" (q-bio.QM/0311009).

    Pseudoalignment for metagenomic read assignment

    Motivation: Read assignment is an important first step in many metagenomic analysis workflows, providing the basis for identification and quantification of species. However, ambiguity among the sequences of many strains makes it difficult to assign reads at the lowest level of taxonomy, and reads are typically assigned to taxonomic levels where they are unambiguous. We explore connections between metagenomic read assignment and the quantification of transcripts from RNA-Seq data in order to develop novel methods for rapid and accurate quantification of metagenomic strains. Results: We find that the recent idea of pseudoalignment introduced in the RNA-Seq context is highly applicable in the metagenomics setting. When coupled with the Expectation-Maximization (EM) algorithm, reads can be assigned far more accurately and quickly than is currently possible with state-of-the-art software, making it possible and practical for the first time to analyze abundances of individual genomes in metagenomics projects.
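To make the role of EM in this abstract concrete, the sketch below shows a minimal EM iteration that splits ambiguous reads among the genomes they are compatible with. It is a simplified illustration, not the authors' software: the function name `em_abundances`, the input layout (a dict from tuples of compatible genome indices to read counts), and the omission of genome-length normalization are all assumptions made for this example.

```python
def em_abundances(compat, n_targets, iters=200):
    """Estimate relative abundances from reads with ambiguous assignments.

    compat maps a tuple of compatible target indices to the number of
    reads with exactly that compatibility set (hypothetical input format).
    """
    alpha = [1.0 / n_targets] * n_targets
    for _ in range(iters):
        counts = [0.0] * n_targets
        # E-step: share each group of reads in proportion to current abundances.
        for targets, n_reads in compat.items():
            z = sum(alpha[t] for t in targets)
            for t in targets:
                counts[t] += n_reads * alpha[t] / z
        # M-step: renormalize the expected counts.
        total = sum(counts)
        alpha = [c / total for c in counts]
    return alpha

# 30 reads unique to genome 0, 10 unique to genome 1, 60 compatible with both.
alpha = em_abundances({(0,): 30, (1,): 10, (0, 1): 60}, n_targets=2)
print(alpha)  # approximately [0.75, 0.25]
```

In this toy input the ambiguous reads end up shared in the same 3:1 ratio as the uniquely assigned reads, which is the fixed point of the update.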

    Expression reflects population structure

    Population structure in genotype data has been extensively studied, and is revealed by looking at the principal components of the genotype matrix. However, no similar analysis of population structure in gene expression data has been conducted, in part because a naïve principal components analysis of the gene expression matrix does not cluster by population. We identify a linear projection that reveals population structure in gene expression data. Our approach relies on the coupling of the principal components of genotype to the principal components of gene expression via canonical correlation analysis. Our method is able to determine the significance of the variance in the canonical correlation projection explained by each gene. We identify 3,571 significant genes, only 837 of which had been previously reported to have an associated eQTL in the GEUVADIS results. We show that our projections are not primarily driven by differences in allele frequency at known cis-eQTLs and that similar projections can be recovered using only several hundred randomly selected genes and SNPs. Finally, we present preliminary work on the consequences for eQTL analysis. We observe that using our projection coordinates as covariates results in the discovery of slightly fewer genes with eQTLs, but that these genes replicate in GTEx matched tissue at a slightly higher rate.

    The m-dissimilarity map and representation theory of SL_m

    We give another proof that m-dissimilarity vectors of weighted trees are points on the tropical Grassmannian, as conjectured by Cools, and proved by Giraldo in response to a question of Sturmfels and Pachter. We accomplish this by relating m-dissimilarity vectors to the representation theory of SL_m. Comment: 11 pages, 8 figures.

    Solving the 100 Swiss Francs Problem

    In 2005, Sturmfels offered 100 Swiss Francs for the resolution of a conjecture concerning a special case of maximum likelihood estimation for a latent class model. This paper confirms the conjecture.

    Improving RNA-Seq expression estimates by correcting for fragment bias

    The biochemistry of RNA-Seq library preparation results in cDNA fragments that are not uniformly distributed within the transcripts they represent. This non-uniformity must be accounted for when estimating expression levels, and we show how to perform the needed corrections using a likelihood-based approach. We find improvements in expression estimates as measured by correlation with independently performed qRT-PCR and show that correction of bias leads to improved replicability of results across libraries and sequencing technologies.